VectorPro(TM) Copyright 1994 Vector Numerics, Inc. 831 N 350 West West Lafayette, IN 47906-4718 USA Version 1.10 all rights reserved Member, Association of Shareware Professionals The following trademarks referenced within this manual are owned by the following people: Microsoft, MS-DOS, Windows, Visual C++, MASM, QuickBASIC Microsoft, Inc. Phar Lap, DOS-Extender Phar Lap Software, Inc. VectorPro, Vector Numerics, Inc. VectorPro Table of Contents Introduction Intro-1 Installation Intro-2 VectorPro Libraries and Files Intro-3 Compiler Compatibility Intro-5 C vs. Assembler Intro-6 Programming with VectorPro Intro-7 Phar Lap Compatibility Intro-9 Complex Numbers Complex-1 cpx_abs Absolute Val of a Complex Number Complex-2 cpx_add Add Two Complex Numbers Complex-3 cpx_argument Finds the Argument (Angle) Complex-4 cpx_convert Converts Arg:Length form to (x,y) Complex-5 cpx_cos Cosine of a Complex Number Complex-6 cpx_cpx_power Complex Number to a Complex Power Complex-7 cpx_div Divide Two Complex Numbers Complex-8 cpx_exp e to a Complex Power Complex-9 cpx_ln Naltural Log of a Complex Number Complex-10 cpx_mult Multiply Two Complex Numbers Complex-11 cpx_power Complex Number to a Real Power Complex-12 cpx_sin Sine of a Complex Number Complex-13 cpx_sqrt Square Root of a Complex Number Complex-14 cpx_sub Subtract Two Complex Numbers Complex-15 Interpolation Inter-1 cubic_spline Init Cubic Spline Interpolation Inter-2 cubic_spline_int Perform Cubic Spline Interpolation Inter-3 Matrix Operations Matrix-1 mat_add Add Two Matrices Matrix-5 mat_colop Perform a Basic Column Operation Matrix-6 mat_copy Copy a Matrix Matrix-7 mat_determinant Find the Determinant of a Matrix Matrix-8 mat_errest Estimate Error for Ax=b Solutions Matrix-9 mat_ident Initialize an Identity Matrix Matrix-11 mat_inverse Invert a Matrix Matrix-12 mat_inverse_errest Invert a Matrix with Err Estimate Matrix-13 mat_lud LU-Decomposition Matrix-15 mat_mult Multiply Two Matrices Matrix-17 mat_rowop Perform a Basic Row Operation Matrix-18 mat_scalar Multiply a Matrix by a Scalar Matrix-19 mat_solve Solve the Matrix Equation Ax=b Matrix-20 mat_sub Subtract Two Matrices Matrix-21 mat_trace Find the Trace of a Matrix Matrix-22 mat_transpose Transpose a Matrix Matrix-23 mat_zero Set a Matrix to Zero Matrix-24 Polynomials - Real Poly-1 poly_add Add Two Polynomials Poly-2 poly_copy Copy a Polynomial Poly-3 poly_deriv Derivative of a Polynomial Poly-4 Contents - 1 poly_div Divide Two Polynomials Poly-5 poly_eval Evaluate a Polynomial Poly-6 poly_integ Integral of a Polynomial Poly-7 poly_lin_change Linear Change of Variable Poly-8 poly_mult Multiply Two Polynomials Poly-9 poly_sub Subtract Two Polynomials Poly-10 poly_zero Set a Polynomial to Zero Poly-11 Polynomials - Complex Poly cpx-1 poly_cpx_add Add Two Polynomials Poly cpx-2 poly_cpx_convert Convert Real Poly to Complex Poly cpx-3 poly_cpx_copy Copy a Polynomial Poly cpx-4 poly_cpx_deriv Derivative of a Polynomial Poly cpx-5 poly_cpx_div Divide Two Polynomials Poly cpx-6 poly_cpx_eval Evaluate a Polynomial Poly cpx-7 poly_cpx_integ Integral of a Polynomial Poly cpx-8 poly_cpx_lin_change Linear Change of Variable Poly cpx-9 poly_cpx_mult Multiply Two Polynomials Poly cpx-10 poly_cpx_root Find a Root of a Polynomial Poly cpx-11 poly_cpx_roots Find All Roots of a Polynomial Poly cpx-12 poly_cpx_sub Subtract Two Polynomials Poly cpx-13 poly_cpx_zero Set a Polynomial to Zero Poly cpx-14 Sorting and Searching Sort-1 binary_search Binary Search on a Real Array Sort-2 binary_search_i Binary Search on a Real Array Sort-3 heap_sort Sort One or More Arrays Sort-5 Trigonometry Trig-1 sincos Sine and Cosine of an Angle Trig-2 Vectors Vectors-1 vec_add Add Two Vectors Vectors-2 vec_ange Angle Between Two Vectors Vectors-3 vec_copy Copy a Vector Vectors-4 vec_crossprod Cross Product of Two Vectors Vectors-5 vec_dotprod Dot Product of Two Vectors Vectors-6 vec_gendotprod Dot Product using Structures Vectors-7 vec_magnitude Find the Magnitude of a Vector Vectors-8 vec_scalar Multiply a Vector by a Scalar Vectors-9 vec_sub Subtract Two Vectors Vectors-10 vec_total Total of Vectors*Scalars Vectors-11 vec_unit Unit Vector Parallel to a Vector Vectors-12 vec_zero Set a Vector to Zero Vectors-13 Appendices Vector Numerics, Inc. License Agreement for VectorPro Support Policies ASP Ombudsman Statement Vector Numerics, Inc. Custom Programming Order Form VectorPro Order From Contents - 2 Introduction to VectorPro Congratulations on getting VectorPro, the high performance numerical package for the Intel xxx86 family. VectorPro was written in Microsoft Visual C++ version 1.5, and in assembly language for an extra boost in speed. VectorPro has been exhaustively tested for accuracy, and makes sure to flag all error and warning conditions. In addition, assembly language versions of the libraries provide greater speed, accuracy, and portability between compilers than the C version. You can use them to help build into your programs the snappy response times today's users expect! This chapter introduces VectorPro, and covers general subjects you need to know no matter which routines you actually call. The reference section gives detailed information about each function. Intro - 1 Installation VectorPro installation simply involves copying the libraries you want to use off of the distribution disks into a subdirectory on your hard disk. If you create a new directory, be sure to add it to your LIB environment variable. You then need to copy *.H and *.INC to a directory listed in your INCLUDE environment variable, or add the VectorPro Example: > C: > MD\VECPRO > CD\VECPRO > XCOPY A:*.* > XCOPY A:*.* (after inserting disk 2) Add the lines: SET LIB=%LIB%;C:\VECPRO SET INCLUDE=%INCLUDE%,C:\VECPRO to your AUTOEXEC.BAT file. Intro - 2 The VectorPro Libraries and files VectorPro consists of the following libraries: SVECPRO.LIB SVECPROA.LIB SVECPRO7.LIB MVECPRO.LIB MVECPROA.LIB MVECPRO7.LIB CVECPRO.LIB CVECPROA.LIB CVECPRO7.LIB LVECPRO.LIB LVECPROA.LIB LVECPRO7.LIB HVECPRO.LIB HVECPROA.LIB HVECPRO7.LIB PVECPROA.LIB PVECPRO7.LIB The first letter of the library name represents the memory model: S = Small or Tiny M = Medium C = Compact L = Large H = Huge P = Phar Lap The last character, if there is one, represents the language requirements of the library: None = compiled with the C compiler. A = Assembled using floating point emulation. 7 = Assembled, requires xxx87 hardware. Note: only SVECPRO.LIB is included in the regular shareware distribution. Intro - 3 Other VectorPro Files: VECPRO.H Definitions and function prototypes for VectorPro VECPROCP.H Definitions and function prototypes for use with C++ Files in the shareware distribution: VPORDER.TXT Order form for registering VectorPro VPCUSTOM.TXT Order form for custom programming LICENSE.TXT Vector Numerics, Inc. License Agreement for VectorPro OMBUDSMN.TXT Association of Shareware Professionals Ombudsman Statement SUPPORT.TXT Vector Numerics Inc. support policy VECPRO.TXT The VectorPro manual FILE_ID.DIZ Description file for posting on BBS systems README.TXT The VectorPro "Read Me" file. Intro - 4 Compiler Compatibility VectorPro was compiled with Microsoft Visual C++, version 1.5, and may be used with either the C or the C++ sides of that compiler. However, it does not perform any I/O to the disk or screen, and calls only mathematical functions like sin(). It Therefore is compatible with recent versions of Microsoft's C compilers. If you get LINK errors for strange undefined symbols like __AFNLMUL, your version of the C compiler is too old. You must either upgrade your C compiler or stop using the C versions of the VectorPro libraries. The assembler versions of the libraries do not call any routines out of the C library, except the floating point emulator. You may therefore use them with any version of Microsoft C. The 7 versions of the libraries require the computer that executes the program to have floating point capability, either as a 486DX or with an add-on xxx87 chip. These library versions reference no symbols defined outside of themselves, and therefore may be used with any compiler or assembler. VectorPro routines may be called from other languages by following the interfacing instructions in Microsoft's Mixed Language Programming Guide. You may be forced to use the assembler versions of the libraries if you get undefined symbols during LINK. Intro - 5 C vs. Assembler The assembler versions of the VectorPro libraries implement the same calculation logic as the C versions. They will, however, often produce slightly different results. The difference is due to a different use of the internal architecture of the xxx87 chip between the C and Assembler versions. The xxx87 chip provides eight internal registers capable of holding a floating point number in a special ten byte format, which provides greater accuracy than the normal eight byte IEEE format used by compilers. When a number is moved into the xxx87 chip, it is automatically expanded into the ten byte format, and all subsequent computations are done using the ten byte format, until the result is saved to memory. At that point, the result is rounded to fit into the eight byte format again. Continually rounding the intermediate results of a calculation as numbers are moved in and out of the FPU takes time, loses accuracy, and increases the memory required for the program. By making more effective use of the eight floating point registers, the assembled versions can follow the same essential logic as the C versions, while gaining an improvement in all three areas. Many important routines run nearly twice as fast in the assembled versions. As an additional gain in accuracy, the assembled versions can store intermediate results in memory in the ten byte format, so no rounding needs to occur when these numbers must be shifted between memory and the FPU. You will not normally have to do anything, since these numbers are normally saved into local memory areas within the functions. There is a structure tenbytes defined in VECPRO.H so you can pass arrays of ten byte areas to some functions which require them. (The C versions of these functions still just use eight of the ten bytes.) Therefore, you will, if you check, find different results between the C and assembler versions. Both versions pass the tolerances in the test routines here at Vector Numerics, Inc., but the assembled versions provide the most accurate results. Intro - 6 Programming with VectorPro Initializing: At the beginning of each module, include the VECPRO.H file. If you will be using the polynomial functions, and you want to store polynomials higher than the 20th degree, add a define statement for the constant VP_POLYNOMIAL_MAX_ORDER before the include of VECPRO.H. At the beginning of the main program, you must call the vecinit function. Here is a sample main program: #define VP_POLYNOMIAL_MAX_ORDER 30 #include "VECPRO.H" etc. main() { vecinit(VP_POLYNOMIAL_MAX_ORDER); /* We are now ready to use VECTORPRO !!! */ etc. } The vecinit routine must be called even if you are not using the polynomial functions. VECPRO.H will also include the following C library include files for you: math.h float.h stdlib.h errno.h Intro - 7 Calling VectorPro Functions: All VectorPro functions return a short integer as an error code. The error codes possible from each function are listed as a part of that function's documentation. All return values are done via pointer parameters to the functions. Pointers are used instead of passing results through the function value because that is faster. All input parameters less than or equal in size to doubles are passed by value. Pointers to the input data are passed for complex numbers, arrays, and structures. When a VectorPro routine requires temporary memory to do its job, a pointer to the memory you want it to use must be passed. The documentation for each function specifies the size of the memory area needed. This approach was taken, instead of using malloc internally, to allow you better control over the memory usage of your program. Calling malloc internally would provide a little cleaner interface, but would inevitably lead to situations where a function would fail because malloc couldn't provide enough memory, but the main program has areas under its control which could be overwritten. Error Recovery: Special versions with names ending in _e are provided for each VectorPro function. The _e versions return VP_INTERNAL_ERROR when an unexpected error like using an invalid number occurs. The normal version of the function will blow up. The _e versions have to spend more time setting up the error trap, so you get the choice of which version you want to use. Intro - 8 Phar Lap Compatibility Two libraries, PVECPROA.LIB and PVECPRO7.LIB are provided for use with the Phar Lap DOS Extender. If you have no data objects greater than 64K, you may use any of the L libraries. You may also link in HVECPRO.LIB if you want to use the C versions with huge data objects. Be sure not to straddle the boundary between 64K segments of huge data objects with any individual elements. Each number or complex number must be entirely within a 64K segment. This applies to Huge model programming as well. Examples: struct bad_struct { char c[65533]; double vector[19]; }; struct good_struct { char c[65528]; double vector[19]; }; Intro - 9 COMPLEX Numbers VectorPro includes a large set of functions for performing operations on complex numbers. These routines use the complex structure defined in math.h: struct complex { double x; /* Real Part */ double y; /* Imaginary Part */ }; All input parameters to the functions are passed as pointers. If this structure is and element of a huge structure, you must be sure the imaginary part is in the same segment as the real part. Terms: absolute value: The length of the line between (0,0) and the complex number. Also known as magnitude. argument: The angle made between the line from (0,0) to the complex number and the x axis. VP_ACC_TLOSS and VP_ACC_PLOSS errors: These errors are flagged because accuracy is lost in the sin() and cos() functions needed by various complex functions for larger angles. This is a natural property of the sin() and cos() functions. If these errors are returned, try to approach the problem a different way to avoid the source of the large angles. Subtracting a multiple of pi will not really help. The inaccuracy arises from the operation of subtracting a large multiple of pi from the angle. Doing this yourself only fools the function, and leaves you with an inaccurate calculation. Complex - 1 cpx abs cpx_abs(double * r, struct complex * a); cpx_abs_e(double * r, struct complex * a); Sets the double r to abs(a). Example: struct complex a; double r; short i; main() { a.x = 5; a.y = -5; if (i = cpx_abs_e(&r, &a)) { /* Handle error i here */ } /* r = abs(a) */ } Possible Errors: NONE Complex - 2 cpx add cpx_add(struct complex * r, struct complex * a, struct complex * b); cpx_add_e(struct complex * r, struct complex * a, struct complex * b); Sets complex number r to a + b. Example: struct complex a, b, r; short i; main() { a.x = 17; a.y = -8; b.x = 2; b.y = 6; if (i = cpx_add_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a + b */ } Possible Errors: NONE Complex - 3 cpx argument cpx_argument(double * r, struct complex * a); cpx_argument_e(double * r, struct complex * a); Sets the double r to argument(a). Example: struct complex a; double r; short i; main() { a.x = -5; a.y = 5; if (i = cpx_argument_e(&r, &a)) { /* Handle error i here */ } /* r = argument(a) */ } Possible Errors: VP_OUT_OF_RANGE a = (0,0) Complex - 4 cpx convert cpx_convert(struct complex * r, double * abs, double arg); cpx_convert_e(struct complex * r, double * abs, double arg); Sets the complex number r to the (x,y) form corresponding to an absolute value abs and argument arg. Example: struct complex r; double abs, arg; short i; main() { abs = 5; arg = 6; if (i = cpx_convert_e(&r, abs, arg)) { /* Handle error i here */ } /* r = (x,y) form of complex number at angle arg, length abs */ } Possible Errors: VP_ACC_TLOSS |arg| > 100000000 VP_ACC_PLOSS |arg| > 10000 Complex - 5 cpx cos cpx_cos(struct complex * r, struct complex * a); cpx_cos_e(struct complex * r, struct complex * a); Sets complex number r to cos(a). Example: struct complex a, r; short i; main() { a.x = -1; a.y = 83; if (i = cpx_cos_e(&r, &a) { /* Handle error i here */ } /* r = cos(a) */ } Possible Errors: VP_OVERFLOW |a.x| > 706 VP_ACC_TLOSS |a.y| > 100000000 VP_ACC_PLOSS |a.y| > 10000 Complex - 6 cpx cpx power cpx_cpx_power(struct complex * r, struct complex * a, struct complex * b); cpx_cpx_power_e(struct complex * r, struct complex * a, struct complex * b); Sets complex number r to a ^ b. Example: struct complex a, b, r; short i; main() { a.x = 12; a.y = 54; b.x = 2; b.y = 6; if (i = cpx_cpx_power_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a ^ b */ } Possible Errors: VP_OUT_OF_RANGE a = (0,0) VP_OVERFLOW VP_ACC_TLOSS VP_ACC_PLOSS TLOSS, PLOSS, and OVERFLOW may occur when a and/or b are too large. Complex - 7 cpx div cpx_div(struct complex * r, struct complex * a, struct complex * b); cpx_div_e(struct complex * r, struct complex * a, struct complex * b); Sets complex number r to a / b. Example: struct complex a, b, r; short i; main() { a.x = 1; a.y = 8; b.x = 12; b.y = 62; if (i = cpx_div_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a / b */ } Possible Errors: VP_DIV_BY_0 Complex - 8 cpx exp cpx_exp(struct complex * r, struct complex * a); cpx_exp_e(struct complex * r, struct complex * a); Sets complex number r to e^a. Example: struct complex a, r; short i; main() { a.x = 1; a.y = 8; if (i = cpx_exp_e(&r, &a)) { /* Handle error i here */ } /* r = e ^ a */ } Possible Errors: VP_OVERFLOW VP_ACC_TLOSS VP_ACC_PLOSS TLOSS, PLOSS, and OVERFLOW may occur if a is too large. Complex - 9 cpx ln cpx_ln(struct complex * r, struct complex * a); cpx_ln_e(struct complex * r, struct complex * a); Sets complex number r to ln(a). Example: struct complex a, r; short i; main() { a.x = 11; a.y = -8; if (i = cpx_ln_e(&r, &a)) { /* Handle error i here */ } /* r = ln(a) */ } Possible Errors: VP_OVERFLOW a = (0,0) Complex - 10 cpx mult cpx_mult(struct complex * r, struct complex * a, struct complex * b); cpx_mult_e(struct complex * r, struct complex * a, struct complex * b); Sets complex number r to a * b. Example: struct complex a, b, r; short i; main() { a.x = 7; a.y = -4; b.x = 22; b.y = -3; if (i = cpx_mult_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a * b */ } Possible Errors: NONE Complex - 11 cpx power cpx_power(struct complex * r, struct complex * a, double exp); cpx_power_e(struct complex * r, struct complex * a, double exp); Sets complex number r to a ^ exp. Example: struct complex a, r; double exp; short i; main() { a.x = -1; a.y = 83; exp = 4.444; if (i = cpx_power_e(&r, &a, exp) { /* Handle error i here */ } /* r = a ^ exp */ } Possible Errors: VP_PARAMETER_ERROR a = (0,0) and exp=0 VP_INFINITY a = (0,0) and exp<0 VP_OVERFLOW VP_ACC_TLOSS VP_ACC_PLOSS TLOSS, PLOSS, and OVERFLOW occur if a and/or exp are too large. Complex - 12 cpx sin cpx_sin(struct complex * r, struct complex * a); cpx_sin_e(struct complex * r, struct complex * a); Sets complex number r to sin(a). Example: struct complex a, r; short i; main() { a.x = -1; a.y = 83; if (i = cpx_sin_e(&r, &a) { /* Handle error i here */ } /* r = sin(a) */ } Possible Errors: VP_OVERFLOW |a.x| > 706 VP_ACC_TLOSS |a.y| > 100000000 VP_ACC_PLOSS |a.y| > 10000 Complex - 13 cpx sqrt cpx_sqrt(struct complex * r, struct complex * a); cpx_sqrt_e(struct complex * r, struct complex * a); Sets complex number r to sqrt(a). Example: struct complex a, r; short i; main() { a.x = -11; a.y = 5; if (i = cpx_sqrt_e(&r, &a) { /* Handle error i here */ } /* r = sqrt(a) */ } Possible Errors: NONE Complex - 14 cpx sub cpx_sub(struct complex * r, struct complex * a, struct complex * b); cpx_sub_e(struct complex * r, struct complex * a, struct complex * b); Sets complex number r to a - b. Example: struct complex a, b, r; short i; main() { a.x = 2; a.y = 8; b.x = 2; b.y = 2; if (i = cpx_sub_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a - b */ } Possible Errors: NONE Complex - 15 Interpolation The method of interpolation provided by VectorPro is Natural Cubic Spline Interpolation. The input to spline interpolation is a table of x and y values, where y[i] = f(x[i]), and x[i] < x[i+1]. If your data is in random order, you must sort it first with the heap_sort function. Cubic spline interpolation generates an interpolation function to fill in the unspecified points in the range with a function that: 1. Equals y[i] for each i in the table, 2. Is continuous over the entire range, 3. Has continuous first and second derivatives over the entire range. Two functions are involved in performing the interpolation: cubic_spline calculates the parameters of the interpolating function, and need only be called once at the time the table is set up in memory. cubic_spline_int performs the actual interpolation, returning an approximate y(x) for any x within the range of the input table. Inter - 1 cubic spline cubic_spline(double * x, double * y, double * y2, double * temp, short n); cubic_spline_e(double * x, double * y, double * y2, double * temp, short n); Computes the interpolating function parameters y2[i], for the arrays x and y. All the pointers are assumed to point to arrays of n elements. temp is needed only within the function as a work area. x, y, and y2 must be saved for later use by cubic_spline_int. The values in x do not need to be evenly spaced. Example: double x[50], y[50], y2[50], temp[50]; short i; main() { /* Input x and y here */ if (i = cubic_spline_e(x, y, y2, temp, 50)) { /* Handle error i here. */ } /* The table x, y, y2 is now ready for use in interpolation. */ } Possible Errors: VP_UNSORTED the array x is unsorted VP_DUPLICATES an x value is duplicated VP_PARAM_ERROR n < 4 Inter - 2 cubic spline int cubic_spline_int(double * x, double * y, double * y2, short n, double xi, double * yi); cubic_spline_int_e(double * x, double * y, double * y2, short n, double xi, double * yi); This function is used to perform the actual interpolations once cubic_spline has set up the array y2. x, y, and y2. xi is the x value, yi is returned as the interpolation function evaluated at xi. Example: double x[50], y[50], y2[50], temp[50], xi, yi; short i; main() { /* Input x, xi, and y here */ if (i = cubic_spline_e(x, y, y2, temp, 50)) { /* Handle error i here. */ } /* The table x, y, y2 is now ready for use in interpolation. */ if (i = cubic_spline_int_e(x, y, y2, 50, xi, &yi)) { /* Handle error i here. */ } /* yi = f(xi) */ } Possible errors: VP_OUT_OF_RANGE xi is outside the table VP_ACC_PLOSS xi is in the first or last interval. Some loss of accuracy may have occurred. Inter - 3 MATRIX Operations Matrices may be stored in memory in two ways, which we will call the "malloc method" and the "parameter method". A duplicate of each matrix routine is supplied so that you may use the method you want. The two methods may be mixed within one program, but not within a single call to VectorPro routines. If you do need to mix arrays stored by different methods, there is a way to fool the parameter method routines into thinking a malloc array is really a parameter method array. The Malloc Method: In this method, the array elements are stored in a block of memory in row order, one row immediately following another. Pointer arithmetic is generally used to accomplish indexing the individual array elements. A typical use of this method would be that a program decides it needs an m by n array, and uses malloc to allocate an m*n*sizeof(double) byte block of memory to store the array. It then uses the pointer returned by malloc to access array elements. Example: short m = 5; short n = 6; double * a; main() { a = (double *) malloc(m*n*sizeof(double)); i = mat_zero(a, m, n); ... } Matrix - 1 The Parameter Method: In this method, the array is declared in a double statement, and preallocated a certain maximum number of rows and columns. Access to array elements is generally done by subscripting the array name. Not all of the allocated rows and columns need to be used; the declaration only specifies a maximum. The versions of the matrix routines for parameter method arrays all have names ending in "_p", and have one or more extra parameters which tell the maximum number of columns declared in the double statements declaring each array parameter. This method is normally used when the programmer knows in advance how large the arrays will need to be. Example: short m = 5; short n = 6; double a[10][20]; \* max of 10 rows, 20 columns *\ main() { i = mat_zero_p(&a[0][0], m, n, 20); ... } To pass a malloc array to the _p routines, pass the actual number of columns in the array to the corresponding number of columns parameter. This effectively tells it that the maximum number of columns is in use, so it will not attempt to skip over any empty columns. Example: short m = 5; short n = 6; double * a; main() { a = (double *) malloc(m*n*sizeof(double)); i = mat_zero_p(a, m, n, 6); ... } Matrix - 2 The matrix routines all assume that the zero row and column of an array are used. If you start with element [1][1], you must use the _p versions. Pass a pointer to element [1][1] instead of a pointer to the beginning of the array. Example: short m = 5; short n = 6; double a[10][20]; \* max of 10 rows, 20 columns *\ main() { i = mat_zero_p(@a[1][1], m, n, 20); ... } For simplicity, all examples in the rest of this section will assume the parameter method is used, and that the zero row and column are used. Matrix - 3 Array Decomposition: VectorPro uses a technique called LU-Decomposition instead of relying on matrix inversion. Many are familiar with the following technique of solving the equation: Ax = b 1. Invert A 2. x = A-1b LU-Decomposition of the array is considerably faster, and, since it does not do as much computation as inversion, allows less opportunity for roundoff errors to accumulate into the result. The output of LU-Decomposition is a modified n by n array, an array of short integers containing one element per row of the array, and a short integer used to calculate the sign of the determinant. These three outputs must be saved and passed to other matrix routines for use along with the decomposed array. A routine is provided which uses LU-Decomposition as an intermediate step in computing the inverse, if the inverse itself is needed. No loss of time is involved in computing the inverse in this way. Example: double a[10][10], b[10], x[10], t[10]; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix a, and vector b */ if (i = mat_lud_p(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_solve_p(&a[0][0], 10, pivots, x, b, 10)) { /* Handle error i */ } /* x is now the solution of the matrix equation Ax = b */ } Matrix - 4 mat add mat_add(double * r, double * a, double * b, short nrows, short ncols); mat_add_e(double * r, double * a, double * b, short nrows, short ncols); mat_add_p(double * r, double * a, double * b, short nrows, short ncols, short rsiz_c, short asiz_c, short bsiz_c); mat_add_p_e(double * r, double * a, double * b, short nrows, short ncols, short rsiz_c, short asiz_c, short bsiz_c); Sets the matrix r to the sum of matrix a and matrix b. Each matrix must contain the same number of rows and columns. Example: double r[5][5], a[5][5], b[5][5]; main() { ... if (i = mat_add_p_e(&r[0][0], &a[0][0], &b[0][0], 5, 5, 5, 5, 5)) { /* Handle error i */ } /* Matrix r now equals matrix a + matrix b */ ... } Possible Errors: NONE Matrix - 5 mat colop mat_colop(double * r, short nrows, short ncols, short change_col, double x, short from_col); mat_colop_e(double * r, short nrows, short ncols, short change_col, double x, short from_col); mat_colop_p(double * r, short nrows, short ncols, short change_col, double x, short from_col, short rsiz_c); mat_colop_p_e(double * r, short nrows, short ncols, short change_col, double x, short from_col, short rsiz_c); Performs a basic column operation on matrix r. x times column from_col is added to column change_col. Example: double r[5][6]; main() { ... if (i = mat_colop_p_e(&r[0][0], 5, 6, 2, (double) 4.44, 4)) { /* Handle error i */ } /* Adds 4.44*column 4 to column 2 of matrix r */ ... } Possible Errors: NONE Matrix - 6 mat copy mat_copy(double * r, double * a, short nrows, short ncols); mat_copy_e(double * r, double * a, short nrows, short ncols); mat_copy_p(double * r, double * a, short nrows, short ncols, short rsiz_c, short asiz_c); mat_copy_p_e(double * r, double * a, short nrows, short ncols, short rsiz_c, short asiz_c); Sets matrix r equal to matrix a. Both arrays must have the same number of rows and columns. Example: double r[5][5], a[5][5]; main() { ... if (i = mat_copy_p_e(&r[0][0], &a[0][0], 5, 5, 5, 5)) { /* Handle error i */ } /* Matrix r now equals matrix a */ ... } Possible Errors: NONE Matrix - 7 mat determinant mat_determinant(double * r, short n, short * pivots, double * d, short dsign); mat_determinant_p(double * r, short n, short * pivots, double * d, short dsign, short rsiz_c); Finds the determinant of matrix r. It is assumed that r has been decomposed by mat_lud, and the pivots array and dsign were saved by mat_lud. Finding the determinant is very fast once the decomposition is done. The matrix r must be square, with n rows and columns. Example: double r[10][10], t[10], d; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix r */ if (i = mat_lud_p_e(&r[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_determinant_p_e(&r[0][0], 10, pivots, &d, dsign, 10)) { /* Handle error i */ } /* d is now the determinant of the matrix r */ } Possible Errors: NONE* * Determinants may be rather large, so there is a danger of overflow here, especially for large matrices. It is suggested you use the _e version of mat_determinant if you have any doubts. Matrix - 8 mat errest mat_errest(double * orig_a, double * a, short n, short * pivots, double * x, double * b, double * resid, double * err, double * tmp1, double * tmp2); mat_errest_e(double * orig_a, double * a, short n, short * pivots, double * x, double * b, double * resid, double * err, double * tmp1, double * tmp2); mat_errest_p(double * orig_a, double * a, short n, short * pivots, double * x, double * b, double * resid, double * err, double * tmp1, double * tmp2, short origa_siz_c short asiz_c); mat_errest_p_e(double * orig_a, double * a, short n, short * pivots, double * x, double * b, double * resid, double * err, double * tmp1, double * tmp2, short origa_siz_c short asiz_c); Error estimation of the solution for the matrix equation Ax = b. The matrices a and orig_a are n by n square matrices. a is the LU-Decomposition of the array orig_a, and the short array pivots is assumed to be output from mat_lud along with a. Terms used: Residual vector: b - Axcomputed Error vector: xcomputed - xexact, approximated by solving Ae = residual vector Norm: a single number that measures the size of a vector or matrix. the maximum absolute value of the elements is used here. *resid is returned as norm(residual vector) / norm(b) *err is returned as norm(error vector) / norm(xcomputed) tmp1 and tmp2 must point to arrays of double with at least n elements each. They are used as a work area during the computations. Vector b is the original vector b as input to mat_solve. Vector x is the output of mat_solve, and is referred to as xcomputed above. resid and err should be as small as possible for a good solution. 0.000001 or less should be good for most purposes. resid and err may be very different for ill-conditioned matrices. Matrix - 9 Example: double oa[10][10], a[10][10], b[10], x[10], t[10], t2[10], resid, err; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix oa, and vector b */ mat_copy_p(&a[0][0], &oa[0][0], 10, 10, 10, 10); if (i = mat_lud_p_e(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_solve_p_e(&a[0][0], 10, pivots, x, b, 10)) { /* Handle error i */ } /* x is now the solution of the matrix equation Ax = b */ if (i = mat_errest_p_e(&oa[0][0], &a[0][0], 10, pivots, x, b, &resid, &err, t, t2, 10, 10)) { /* Handle error i */ } if (err > 0.000001) printf("WARNING: large error\n"); } Possible errors: NONE Matrix - 10 mat_ident mat_ident(double * r, short n); mat_ident_e(double * r, short n); mat_ident_p(double * r, short n, short rsiz_c); mat_ident_p_e(double * r, short n, short rsiz_c); Sets r to an identity matrix. r has n rows and n columns. Example: double r[10][10]; short i; main() { if (i = mat_ident_e(&r[0][0], 9, 10)) { /* Handle error i here */ } /* r now contains a 9 by 9 identity matrix. */ } Possible Errors: NONE Matrix - 11 mat inverse mat_inverse(double * r, double * a, short n, short * pivots, double * tmp1, double * tmp2); mat_inverse_e(double * r, double * a, short n, short * pivots, double * tmp1, double * tmp2); mat_inverse_p(double * r, double * a, short n, short * pivots, double * tmp1, double * tmp2, short rsiz_c, short asiz_c); mat_inverse_p_e(double * r, double * a, short n, short * pivots, double * tmp1, double * tmp2, short rsiz_c, short asiz_c); Sets r to the inverse of the original matrix a, where a and the pivots array are output from mat_lud for the original a. Both r and a must be square matrices with n rows and columns. tmp1 and tmp2 must point to arrays of double with at least n elements; they are used as temporary space during the calculation. Example: double oa[10][10], a[10][10], a_inv[10][10], t[10], t2[10]; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix oa */ mat_copy_p_e(&a[0][0], &oa[0][0], 10, 10, 10, 10); if (i = mat_lud_p_e(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_inverse_p_e(&a_inv[0][0], &a[0][0], 10, pivots, t, t2, 10, 10)) { /* Handle error i */ } /* a_inv is now the inverse of the original matrix a */ } Possible Errors: NONE Matrix - 12 mat inverse errest mat_inverse_errest(double * r, double * orig_a, double * a, short n, short * pivots, double * resid, double * err, double * tmp1, double * tmp2, double * tmp3, double * tmp4); mat_inverse_errest_e(double * r, double * orig_a, double * a, short n, short * pivots, double * resid, double * err, double * tmp1, double * tmp2, double * tmp3, double * tmp4); mat_inverse_errest_p(double * r, double * orig_a, double * a, short n, short * pivots, double * resid, double * err, double * tmp1, double * tmp2, double * tmp3, double * tmp4, short rsiz_c, short origa_siz_c, short asiz_c); mat_inverse_errest_p_e(double * r, double * orig_a, double * a, short n, short * pivots, double * resid, double * err, double * tmp1, double * tmp2, double * tmp3, double * tmp4, short rsiz_c, short origa_siz_c, short asiz_c); Sets matrix r to the inverse of matrix orig_a, where a and the pivots array are output from mat_lud for matrix orig_a. All three must be square matrices with n rows and columns. Terms used: Residual matrix: I - AA-1computed Error matrix: A-1computed - A-1exact, approximated by solving AE = residual matrix Norm: a single number that measures the size of a vector or matrix. the maximum absolute value of the elements is used here. *resid is returned as norm(residual matrix) / norm(I) *err is returned as norm(error matrix) / norm(A-1computed) tmp1, tmp2, tmp3, and tmp4 must point to arrays of double with at least n elements each. They are used as a work area during the computations. resid and err should be as small as possible for a good solution. 0.000001 or less should be good for most purposes. resid and err may be very different for ill-conditioned matrices. Matrix - 13 Example: double oa[10][10], a[10][10], a_inv[10][10], t[10], t2[10], t3[10] double resid, err, t4[10]; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix oa, and vector b */ mat_copy_p_e(&a[0][0], &oa[0][0], 10, 10, 10, 10); if (i = mat_lud_p_e(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_inverse_errest_p_e(&a_inv[0][0], &oa[0][0], &a[0][0], 10, pivots, &resid, &err, t, t2, t3, t4, 10, 10 10)) { /* Handle error i */ } /* a_inv is now the inverse of the matrix oa */ if (err > 0.000001) printf("WARNING: large error\n"); } Possible errors: NONE Matrix - 14 mat lud mat_lud(double * r, short n, short * pivots, short * dsign, double * tmp, struct tenbytes * tmp2); mat_lud_e(double * r, short n, short * pivots, short * dsign, double * tmp, struct tenbytes * tmp2); mat_lud_p(double * r, short n, short * pivots, short * dsign, double * tmp, struct tenbytes * tmp2, short rsiz_c); mat_lud_p_e(double * r, short n, short * pivots, short * dsign, double * tmp, struct tenbytes * tmp2, short rsiz_c); Performs LU-Decomposition on array r, which must be a square matrix with n rows and columns. If you need to preserve the original array, make a copy first. mat_lud also outputs an array of shorts (pivots) which must have at least n elements, and a short integer (dsign) used to calculate the sign of the determinant. One or both of these outputs may be required input for other VectorPro routines, along with r. tmp and tmp2 must point to arrays with at least n elements available. They are used as a work area. Doolittle's method is used. Doolittle's method is very closely related to Crout's method. Example: double a[10][10], t[10]; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix a */ if (i = mat_lud_p_e(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } /* The array a is now ready for use by other matrix routines */ Possible Errors: VP_SINGULAR VP_ILL_CONDITIONED Matrix - 15 If the matrix was found to be singular, the decomposition halts immediately, and the error is returned. The array, and the short integer outputs will be meaningless. If the matrix was found to be ill conditioned, the decomposition is completed, but you must be very careful with the result. See a good text on numerical analysis for a full discussion of singular and ill conditioned matrices. A matrix A is declared ill conditioned if mat_lud feels enough error has accumulated in the calculation such that, if the inverse A-1computed were calculated: (max | I - A*A-1computed |i,j for all i and j) > 0.000001 The tolerance for a singular matrix is 0.01. Please note that these errors flag a property of the matrix itself, not some artifact of the method used. If you get these errors, you need to carefully reformulate the problem to try to get around them. Matrix - 16 mat mult mat_mult(double * r, double * a, double * b, short arows, short acols, short bcols); mat_mult_e(double * r, double * a, double * b, short arows, short acols, short bcols); mat_mult_p(double * r, double * a, double * b, short arows, short acols, short bcols, short rsiz_c, short asiz_c, short bsiz_c); mat_mult_p_e(double * r, double * a, double * b, short arows, short acols, short bcols, short rsiz_c, short asiz_c, short bsiz_c); Sets matrix r = matrix a times matrix b. a has arows rows and acols columns. b has acols rows and bcols columns. r has arows rows and bcols columns. Example: double a[6][7], b[7][8], r[6][8]; short i; main() { /* initialize arrays a and b */ if (i = mat_mult_p_e(&r[0][0], &a[0][0], &b[0][0], 6, 7, 8, 8, 8, 7)) { /* Handle error i here */ } /* r now = a * b */ } Possible Errors: NONE Matrix - 17 mat rowop mat_rowop(double * r, short nrows, short ncols, short change_row, double x, short from_row); mat_rowop_e(double * r, short nrows, short ncols, short change_row, double x, short from_row); mat_rowop_p(double * r, short nrows, short ncols, short change_row, double x, short from_row, short rsiz_c); mat_rowop_p_e(double * r, short nrows, short ncols, short change_row, double x, short from_row, short rsiz_c); Performs a basic row operation on matrix r. x times row from_row is added to row change_row. Example: double r[5][6]; main() { ... if (i = mat_rowop_p_e(&r[0][0], 5, 6, 2, (double) 4.44, 4)) { /* Handle error i */ } /* Adds 4.44*row 4 to row 2 of matrix r */ ... } Possible Errors: NONE Matrix - 18 mat scalar mat_scalar(double * r, double * a, double x, short nrows, short ncols); mat_scalar_e(double * r, double * a, double x, short nrows, short ncols); mat_scalar_p(double * r, double * a, double x, short nrows, short ncols, short rsiz_c, short asiz_c); mat_scalar_p_e(double * r, double * a, double x, short nrows, short ncols, short rsiz_c, short asiz_c); Sets matrix r = scalar x times matrix a. Both r and a have nrows rows and ncols columns. Example: double a[6][7], r[6][7], x; short i; main() { /* initialize array a and scalar x */ if (i = mat_scalar_p_e(&r[0][0], &a[0][0], x, 6, 7, 7, 7)) { /* Handle error i here */ } /* r now = x * a */ } Possible Errors: NONE Matrix - 19 mat solve mat_solve(double * a, short n, short * pivots, double * x, double * b); mat_solve_e(double * a, short n, short * pivots, double * x, double * b); mat_solve_p(double * a, short n, short * pivots, double * x, double * b, short asiz_c); mat_solve_p_e(double * a, short n, short * pivots, double * x, double * b, short asiz_c); Solves the matrix equation Ax=b, where a is a square matrix with n rows and n columns, and x and b are vectors with n dimensions. The a input to this routine must have been run through mat_lud first, and the pivots array output by mat_lud must be preserved for use here. Example: double a[10][10], b[10], x[10], t[10]; short i, dsign, pivots[10]; struct tenbytes tb[10]; main() { ... /* setup data in 10 by 10 matrix a, and vector b */ if (i = mat_lud_p_e(&a[0][0], 10, pivots, &dsign, t, tb, 10)) { /* Handle error i */ } if (i = mat_solve_p_e(&a[0][0], 10, pivots, x, b, 10)) { /* Handle error i */ } /* x is now the solution of the matrix equation Ax = b */ } Possible Errors: NONE Matrix - 20 mat sub mat_sub(double * r, double * a, double * b, short nrows, short ncols); mat_sub_e(double * r, double * a, double * b, short nrows, short ncols); mat_sub_p(double * r, double * a, double * b, short nrows, short ncols, short rsiz_c, short asiz_c, short bsiz_c); mat_sub_p_e(double * r, double * a, double * b, short nrows, short ncols, short rsiz_c, short asiz_c, short bsiz_c); Sets the matrix r to the difference of matrix a and matrix b. Each matrix must contain the same number of rows and columns. Example: double r[5][5], a[5][5], b[5][5]; main() { ... if (i = mat_sub_p_e(&r[0][0], &a[0][0], &b[0][0], 5, 5, 5, 5, 5)) { /* Handle error i */ } /* Matrix r now equals matrix a - matrix b */ ... } Possible Errors: NONE Matrix - 21 mat trace mat_trace(double * t, double * a, short n); mat_trace_e(double * t, double * a, short n); mat_trace_p(double * t, double * a, short n, short asiz_c); mat_trace_p_e(double * t, double * a, short n, short asiz_c); Returns the trace of the n by n matrix a in t. Example: double t, a[6][6]; short i; main() { /* Initialize array a here */ if (i = mat_trace_p_e(&t, &a[0][0], 6, 6)) { /* Handle error i here */ } /* t is now equal to the trace of matrix a */ } Possible Errors: NONE Matrix - 22 mat transpose mat_transpose(double * r, short nrows, short ncols, double * a); mat_transpose_e(double * r, short nrows, short ncols, double * a); mat_transpose_p(double * r, short nrows, short ncols, double * a, short rsiz_c, short asiz_c); mat_transpose_p_e(double * r, short nrows, short ncols, double * a, short rsiz_c, short asiz_c); Sets matrix r to be the transpose of matrix a. Matrix a has nrows rows and ncols columns, while r has ncols rows and nrows columns. Example: double a[6][8], r[8][6]; short i; main() { /* Initialize 6 by 8 matrix a here */ if (i = mat_transpose_p_e(&r[0][0], 6, 8, &a[0][0], 6, 8)) { /* Handle error i here */ } /* r now = transpose(a) */ } Possible Errors: NONE Matrix - 23 mat zero mat_zero(double * r, short n); mat_zero_e(double * r, short n); mat_zero_p(double * r, short n, short rsiz_c); mat_zero_p_e(double * r, short n, short rsiz_c); Sets r to an zero matrix. r has n rows and n columns. Example: double r[10][10]; short i; main() { if (i = mat_zero_p_e(&r[0][0], 9, 10)) { /* Handle error i here */ } /* r now contains a 9 by 9 zero matrix. */ } Possible Errors: NONE Matrix - 24 POLYNOMIAL Functions The polynomial functions use the structure poly, which is defined in the VECPRO.H file. They also make use of the constant VP_POLYNOMIAL_MAX_ORDER. The default value for MAX_ORDER is 20, meaning each polynomial may contain up to the x20 term. If you want to define a different MAX_ORDER, 30, for example, add the following line to all your source files before including VECPRO.H: #define VP_POLYNOMIAL_MAX_ORDER 30 The structure poly is defined as follows: struct poly { short order; char waste[6]; double co[VP_POLYNOMIAL_MAX_ORDER+1]; } The waste array is there to be sure the coefficient array is aligned efficiently in memory. order contains the current order of the polynomial, and may range from 0 to VP_POLYNOMIAL_MAX_ORDER. co contains the coefficients of the polynomial: polynomial = co[0] + co[1]x + co[2]x2 etc. Note: For generality, the root finding routines work on complex polynomials. Use the poly_cpx_convert function to convert real polynomials to complex. poly_cpx_convert is documented in the complex polynomials chapter. Poly - 1 poly add poly_add(struct poly * r, struct poly * a, struct poly * b); poly_add_e(struct poly * r, struct poly * a, struct poly * b); Sets polynomial r = polynomial a + polynomial b. r.order becomes max(a.order, b.order). Example: struct poly r; struct poly a; struct poly b; short i; main() { /* Initialize polynomials a and b here */ if (i = poly_add_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a + b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly - 2 poly copy poly_copy(struct poly * r, struct poly * a); poly_copy_e(struct poly * r, struct poly * a); Sets polynomial r = polynomial a. r.order becomes a.order. Example: struct poly r; struct poly a; short i; main() { /* Initialize polynomial a here */ if (i = poly_copy_e(&r, &a)) { /* Handle error i here */ } /* r = a */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly - 3 poly deriv poly_deriv(struct poly * r, struct poly * a); poly_deriv_e(struct poly * r, struct poly * a); Sets polynomial r = derivative(polynomial a). r.order becomes a.order - 1, unless a.order was already zero.. Example: struct poly r; struct poly a; short i; main() { /* Initialize polynomial a here */ if (i = poly_deriv_e(&r, &a)) { /* Handle error i here */ } /* r = derivative(a) */ } Possible errors: NONE Poly - 4 poly div poly_div(struct poly * r, struct poly * a, struct poly * b, struct poly * m); poly_div_e(struct poly * r, struct poly * a, struct poly * b, struct poly * m); Sets polynomial r = polynomial a / polynomial b. Any remainder is set into polynomial m. Example: struct poly r; struct poly a; struct poly b; struct poly m; short i; main() { /* Initialize polynomials a and b here */ if (i = poly_div_e(&r, &a, &b, &m)) { /* Handle error i here */ } /* r = a / b, remainder is in m */ } Possible errors: VP_DIV_BY_0 Poly - 5 poly eval poly_eval(double * r, double x, struct poly * a); poly_eval_e(double * r, double x, struct poly * a); Sets r = polynomial a(x). Example: double r, x; struct poly a; short i; main() { /* Initialize polynomial a and x here */ if (i = poly_eval_e(&r, x, &a)) { /* Handle error i here */ } /* r = a(x) */ } Possible errors: NONE Poly - 6 poly integ poly_integ(struct poly * r, struct poly * a); poly_integ_e(struct poly * r, struct poly * a); Sets polynomial r = integral(polynomial a). r.order becomes a.order + 1. Example: struct poly r; struct poly a; short i; main() { /* Initialize polynomials a and b here */ if (i = poly_integ_e(&r, &a)) { /* Handle error i here */ } /* r = integral(a) */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly - 7 poly lin change poly_lin_change(struct poly * r, struct poly * a, double ycoeff, double c, double * temp1, double * temp2); poly_lin_change_e(struct poly * r, struct poly * a, double ycoeff, double c, double * temp1, double * temp2); Makes a new polynomial r(y) = polynomial a(x), where x = ycoeff*y + c. r.order becomes a.order. temp1 and temp2 must point to arrays of complex with at least VP_POLYNOMIAL_MAX_ORDER elements. They are used as a work area. Example: struct poly r; struct poly a; double ycoeff, c; double temp1[VP_POLYNOMIAL_MAX_ORDER + 1]; double temp2[VP_POLYNOMIAL_MAX_ORDER + 1]; short i; main() { /* Initialize polynomial a, ycoeff, and c here */ if (i = poly_lin_change_e(&r, &a, ycoeff, c, temp1, temp2)) { /* Handle error i here */ } /* r(y) = a(x), where x = ycoeff*y + c */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. VP_PARAM_ERROR ycoeff was zero. Poly - 8 poly mult poly_mult(struct poly * r, struct poly * a, struct poly * b); poly_mult_e(struct poly * r, struct poly * a, struct poly * b); Sets polynomial r = polynomial a * polynomial b. r.order becomes a.order + b.order. Example: struct poly r; struct poly a; struct poly b; short i; main() { /* Initialize polynomials a and b here */ if (i = poly_mult_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a * b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly - 9 poly sub poly_sub(struct poly * r, struct poly * a, struct poly * b); poly_sub_e(struct poly * r, struct poly * a, struct poly * b); Sets polynomial r = polynomial a - polynomial b. r.order becomes max(a.order, b.order). Example: struct poly r; struct poly a; struct poly b; short i; main() { /* Initialize polynomials a and b here */ if (i = poly_sub_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a - b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly - 10 poly zero poly_zero(struct poly * r); poly_zero_e(struct poly * r); Sets polynomial r = 0. r.order becomes 0, co[0] becomes 0. Example: struct poly r; short i; main() { if (i = poly_zero_e(&r)) { /* Handle error i here */ } /* r = 0 */ } Possible errors: NONE Poly - 11 COMPLEX POLYNOMIAL Functions The complex polynomial functions use the structure polyx, which is defined in the VECPRO.H file. They also make use of the constant VP_POLYNOMIAL_MAX_ORDER. The default value for MAX_ORDER is 20, meaning each complex polynomial may contain up to the x20 term. If you want to define a different MAX_ORDER, 30, for example, add the following line to all your source files before including VECPRO.H: #define VP_POLYNOMIAL_MAX_ORDER 30 The structure polyx is defined as follows: struct polyx { short order; char waste[6]; struct complex co[VP_POLYNOMIAL_MAX_ORDER+1]; } The waste array is there to be sure the coefficient array is aligned efficiently in memory. order contains the current order of the complex polynomial, and may range from 0 to VP_POLYNOMIAL_MAX_ORDER. co contains the coefficients of the complex polynomial: complex polynomial = co[0] + co[1]x + co[2]x2 etc. Poly cpx - 1 poly cpx add poly_cpx_add(struct polyx * r, struct polyx * a, struct polyx * b); poly_cpx_add_e(struct polyx * r, struct polyx * a, struct polyx * b); Sets complex polynomial r = complex polynomial a + complex polynomial b. r.order becomes max(a.order, b.order). Example: struct polyx r; struct polyx a; struct polyx b; short i; main() { /* Initialize complex polynomials a and b here */ if (i = poly_cpx_add_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a + b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 2 poly cpx convert poly_cpx_convert(struct polyx * r, struct polyx* a); poly_cpx_convert_e(struct polyx * r, struct poly * a); Sets complex polynomial r = real polynomial a. r.order becomes a.order. Example: struct polyx r; struct poly a; short i; main() { /* Initialize polynomial a here */ if (i = poly_cpx_convert_e(&r, &a)) { /* Handle error i here */ } /* r = a */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 3 poly cpx copy poly_cpx_copy(struct polyx * r, struct polyx * a); poly_cpx_copy_e(struct polyx * r, struct polyx * a); Sets complex polynomial r = complex polynomial a. r.order becomes a.order. Example: struct polyx r; struct polyx a; short i; main() { /* Initialize complex polynomial a here */ if (i = poly_cpx_copy_e(&r, &a)) { /* Handle error i here */ } /* r = a */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 4 poly cpx deriv poly_cpx_deriv(struct polyx * r, struct polyx * a); poly_cpx_deriv_e(struct polyx * r, struct polyx * a); Sets complex polynomial r = derivative(complex polynomial a). r.order becomes a.order - 1, unless a.order was already zero.. Example: struct polyx r; struct polyx a; short i; main() { /* Initialize complex polynomial a here */ if (i = poly_cpx_deriv_e(&r, &a)) { /* Handle error i here */ } /* r = derivative(a) */ } Possible errors: NONE Poly cpx - 5 poly cpx div poly_cpx_div(struct polyx * r, struct polyx * a, struct polyx * b, struct polyx * m); poly_cpx_div_e(struct polyx * r, struct polyx * a, struct polyx * b, struct polyx * m); Sets complex polynomial r = complex polynomial a / complex polynomial b. Any remainder is set into complex polynomial m. Example: struct polyx r; struct polyx a; struct polyx b; struct polyx m; short i; main() { /* Initialize complex polynomials a and b here */ if (i = poly_cpx_div_e(&r, &a, &b, &m)) { /* Handle error i here */ } /* r = a / b, remainder is in m */ } Possible errors: VP_DIV_BY_0 Poly cpx - 6 poly cpx eval poly_cpx_eval(struct complex * r, struct complex * x, struct polyx * a); poly_cpx_eval_e(struct complex * r, struct complex * x, struct polyx * a); Sets r = complex polynomial a(x). Example: struct complex r, x; struct polyx a; short i; main() { /* Initialize complex polynomial a and x here */ if (i = poly_cpx_eval_e(&r, &x, &a)) { /* Handle error i here */ } /* r = a(x) */ } Possible errors: NONE Poly cpx - 7 poly cpx integ poly_cpx_integ(struct polyx * r, struct polyx * a); poly_cpx_integ_e(struct polyx * r, struct polyx * a); Sets complex polynomial r = integral(complex polynomial a). r.order becomes a.order + 1. Example: struct polyx r; struct polyx a; short i; main() { /* Initialize complex polynomials a and b here */ if (i = poly_cpx_integ_e(&r, &a)) { /* Handle error i here */ } /* r = integral(a) */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 8 poly cpx lin change poly_cpx_lin_change(struct polyx * r, struct polyx * a, struct complex ycoeff, struct complex c, struct complex * temp1, struct complex * temp2); poly_cpx_lin_change_e(struct polyx * r, struct polyx * a, struct complex ycoeff, struct complex c, struct complex * temp1, struct complex * temp2); Makes a new complex polynomial r(y) = complex polynomial a(x), where x = ycoeff*y + c. r.order becomes a.order. temp1 and temp2 must point to arrays of complex with at least VP_POLYNOMIAL_MAX_ORDER elements. They are used as a work area. Example: struct polyx r; struct polyx a; struct complex ycoeff, c; struct complex temp1[VP_POLYNOMIAL_MAX_ORDER + 1]; struct complex temp2[VP_POLYNOMIAL_MAX_ORDER + 1]; short i; main() { /* Initialize complex polynomial a, ycoeff, and c here */ if (i = poly_cpx_lin_change_e(&r, &a, &ycoeff, &c, temp1, temp2)) { /* Handle error i here */ } /* r(y) = a(x), where x = ycoeff*y + c */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. VP_PARAM_ERROR ycoeff was zero. Poly cpx - 9 poly cpx mult poly_cpx_mult(struct polyx * r, struct polyx * a, struct polyx * b); poly_cpx_mult_e(struct polyx * r, struct polyx * a, struct polyx * b); Sets complex polynomial r = complex polynomial a * complex polynomial b. r.order becomes a.order + b.order. Example: struct polyx r; struct polyx a; struct polyx b; short i; main() { /* Initialize complex polynomials a and b here */ if (i = poly_cpx_mult_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a * b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 10 poly cpx root poly_cpx_root(struct complex * r, double * c, struct polyx * a, struct complex * guess); poly_cpx_root_e(struct complex * r, double * c, struct polyx * a, struct complex * guess); Finds one root of the polynomial a, and returns it in r. The search for the root is begun at guess. The condition number of the root is returned in c. It indicates the amount of error to be expected in the determination of the root. The smaller the condition number is, the more accurate the calculated root is. The condition number may range from one on up. If it gets above 1014, the calculated root is worthless. It should be noted that the condition number measures how sensitive the value of the calculated root is to error in the coefficient array, and thus is a property of the polynomial itself, not of the calculation method. If a large condition number is returned, you need to reformulate your problem to avoid it. Example: struct polyx a; struct complex r, guess; double c; short i; main() { /* Initialize complex polynomial a, and guess here. */ if (i = poly_cpx_root_e(&r, &c, &a, &guess)) { /* Handle error i here */ } /* We have now found a root: r */ } Possible Errors: VP_ILL_CONDITIONED c > 1010 VP_PARAM_ERROR a.order = 0 VP_NO_CONVERGENCE Lagurre's Method is used. Poly cpx - 11 poly cpx roots poly_cpx_root(struct complex * r, double * c, struct polyx * a, struct polyx * temp, struct polyx * temp2, struct polyx * temp3); poly_cpx_root_e(struct complex * r, double * c, struct polyx * a, struct polyx * temp, struct polyx * temp2, struct polyx * temp3); Finds all roots of the polynomial a, and returns them in r. r and c must point to arrays of at least a.order elements. The condition numbers of the roots are returned in c. See poly_cpx_root for a discussion of the condition number. temp, temp2, and temp3 are used as a work area. Example: struct polyx a, temp, temp2, temp3; struct complex r[VP_POLYNOMIAL_MAX_ORDER]; double c[VP_POLYNOMIAL_MAX_ORDER]; short i; main() { /* Initialize complex polynomial a here. */ if (i = poly_cpx_roots_e(r, c, &a, temp, temp2, temp3)) { /* Handle error i here */ } /* We have now found the roots of a */ } Possible Errors: VP_ILL_CONDITIONED some c[i] > 1010 VP_PARAM_ERROR a.order = 0 VP_NO_CONVERGENCE VP_ACC_PLOSS there was weak convergence Lagurre's Method with deflation and root polishing is used Poly cpx - 12 poly cpx sub poly_cpx_sub(struct polyx * r, struct polyx * a, struct polyx * b); poly_cpx_sub_e(struct polyx * r, struct polyx * a, struct polyx * b); Sets complex polynomial r = complex polynomial a - complex polynomial b. r.order becomes max(a.order, b.order). Example: struct polyx r; struct polyx a; struct polyx b; short i; main() { /* Initialize complex polynomials a and b here */ if (i = poly_cpx_sub_e(&r, &a, &b)) { /* Handle error i here */ } /* r = a - b */ } Possible errors: VP_OVERFLOW attempted to make r.order too large. Poly cpx - 13 poly cpx zero poly_cpx_zero(struct polyx * r); poly_cpx_zero_e(struct polyx * r); Sets complex polynomial r = 0. r.order becomes 0, co[0] becomes 0. Example: struct polyx r; short i; main() { if (i = poly_cpx_zero_e(&r)) { /* Handle error i here */ } /* r = 0 */ } Possible errors: NONE Poly cpx - 14 Sorting and Searching A sort routine, and two binary search routines are provided. They are intended to complement the standard routines of the C library, which are intended to operate on structures. The VectorPro routines operate on data in double arrays. Sort - 1 binary search binary_search(double * x, short n, double xi); binary_search_e(double * x, short n, double xi); Performs a binary search for the value xi on a sorted array of n doubles pointed to by x. The index of xi is returned as the function value. -1 is returned if xi could not be found. Example: double x[100]; double xi; short i; main() { /* Load and sort the array x, initialize xi */ if ((i = binary_search_e(x, 100, xi)) == -1) { /* Search failed. Handle the error. */ } /* i = the index of xi */ } Possible Errors: -1 Search failed Sort - 2 binary search i binary_search_i(double * x, short n, double xi); binary_search_i_e(double * x, short n, double xi); Performs a binary search for the value xi on a sorted array of n doubles pointed to by x. The index of xi is returned as the function value. If xi could not be found, but xi is within the table's range of values, the index of the largest element smaller than xi is returned. -1 is returned if xi is less than x[0] or greater than x[n-1]. binary_search_i is useful for certain situations where you want to know where xi would go if it was in the table. You may want to know where to add it, or find the closest value, etc. Example: double x[100], xi; short i; main() { /* Load and sort the array x, initialize xi */ if ((i = binary_search_i_e(x, 100, xi)) == -1) { /* Search failed. Handle the error. */ if (xi < x[0]) { /* xi is too small for this table */ } else { /* xi is too large for this table */ } } /* i = the index of xi */ if (xi == x[i]) { /* There was an exact match */ } else { /* xi goes between elements i and i+1 */ } } Possible Errors: -1 Search failed Sort - 3 heap sort heap_sort(double * x, short n, double ** others, short n_others); heap_sort_e(double * x, short n, double ** others, short n_others); heap_sort is used to sort data stored in arrays of double. To sort structures, use the library routine qsort. heap_sort will also, if you wish, rearrange other arrays to keep their elements associated with the corresponding elements of the sorted array. x is the array to sort. n is the number of elements in x. others is a pointer to an array of pointers containing n_others elements. Each array pointed to by others is also rearranged. Example: double x[100], o1[100], o2[100], o3[100]; double * others[3] = {o1, o2, o3}; short i; main() { /* Read in arrays of data x, o1, o2, o3 */ if (i = heap_sort_e(x, 100, others, 3)) { /* Handle error i here */ } /* The 4 columns of data are now sorted by column x */ } Possible errors: NONE Sort - 4 Trigonometry One trigonometry routine is provided: sincos. It exists because you often need both the sine and cosine of the same angle. sincos can calculate both at once almost as fast as the normal sin and cos functions can calculate a single one. Trig - 1 sincos sincos(double theta, double * s, double * c); sincos_e(double theta, double * s, double * c); sincos is used to calculate the sine and cosine of an angle, theta, simulataneously. It is provided simply for speed. Using sincos is almost twice as fast as making as call to sin() and then to cos(). The sine is returned via s and the cosine via c. Example: double c, s, theta, r, x, y; short i; main() { theta = 0.4; r = 10.0; if (i = sincos_e(theta, &s, &c)) { /* Handle error i here */ } x = c*r; y = s*r; /* Polar coordinates (r,theta) now converted to cartesian coordinates (x,y) */ } Possible Errors: NONE Trig - 2 VECTOR Operations Vectors are assumed to be stored in arrays of double, starting at element zero. Pass vectors to the functions by passing a pointer to double pointed at the zero element of the array. If you prefer, you may write your main program to begin at element one. If so, just pass a pointer to element one where a pointer to a vector is expected. Examples: Assuming the zero elements are used: double v[3]; short i; if ((i = vec_zero(v, 3)) != 0) { /* Recover from error i here ... */ } /* Vector v has now been set to zero. */ Assuming vectors begin with element one: double v[4]; short i; if ((i = vec_zero(&v[1], 3)) != 0) { /* Recover from error i here ... */ } /* Vector v has now been set to zero. */ The remaining examples in this chapter will assume that the zero element of all vectors is used. Unless otherwise specified, all routines accept vectors with any number of dimensions. > Vectors - 1 vec add vec_add(double * r, double * a, double * b, short n); vec_add_e(double * r, double * a, double * b, short n); Sets the vector r equal to the sum of vectors a and b. All vectors are assumed to have n elements. If the pointer r equals the pointer a or b, the function will still work correctly. Example: double a[3], b[3], r[3]; short i; ... if ((i = vec_add_e(r, a, b, 3) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 2 vec angle vec_angle(double * theta, double * a, double * b, short n); vec_angle_e(double * theta, double * a, double * b, short n); Sets the scalar *theta equal to the angle, in radians, between two vectors a and b. All vectors are assumed to have n elements. Example: double a[3], b[3], theta; short i; ... if ((i = vec_angle_e(theta, a, b, 3) != 0) { /* Recover from error i here */ } Possible errors: DIV_BY_ZERO Either a or b is a zero vector Vectors - 3 vec copy vec_copy(double * r, double * a, short n); vec_copy_e(double * r, double * a, short n); Sets the vector r equal to the vector a. All vectors are assumed to have n elements. Example: double a[3], r[3]; short i; ... if ((i = vec_copy_e(r, a, 3) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 4 vec crossprod vec_crossprod(double * r, double * a, double * b); vec_crossprod_e(double * r, double * a, double * b); Sets the vector r equal to the cross product of vectors a and b. All vectors are assumed to have 3 elements. Example: double a[3], b[3], r[3]; short i; ... if ((i = vec_crossprod_e(r, a, b) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 5 vec dotprod vec_dotprod(double * d, double * a, double * b, short n); vec_dotprod_e(double * d, double * a, double * b, short n); Sets the scalar *d equal to the dot product of vectors a and b. All vectors are assumed to have n elements. Example: double a[3], b[3], theta; short i; ... if ((i = vec_dotprod_e(&d, a, b, 3) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 6 vec gendotprod vec_gendotprod(double * d, char * a, char * b, short n, short a_inc, short b_inc); vec_gendotprod_e(double * d, char * a, char * b, short n, short a_inc, short b_inc); Sets the scalar *d equal to the dot product of vectors a and b. All vectors are assumed to have n elements. In this function, the vectors a and b are assumed to be elements of larger structures. The pointers passed are pointers to the vector elements in the first structure of each structure array. a_inc and b_inc are the sizes of the structures. They are needed to find the vector elements in the following structures. Example: struct my_data { short q; double number; char m[20]; } struct my_data a[50]; struct my_data b[50]; short i; double d; ... if ((i = vec_gendotprod_e(&d, (char *) a[0].number, (char *) b[0].number, 50, sizeof(struct my_data), sizeof(struct my_data)) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 7 vec magnitude vec_magnitude(double * mag, double * a, short n); vec_magnitude_e(double * mag, double * a, short n); Sets the scalar *mag equal to the magnitude of vector a. Vector a is assumed to have n elements. Example: double a[6], mag; short i; ... if ((i = vec_magnitude_e(mag, a, 6) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 8 vec scalar vec_scalar(double * r, double * a, double * x, short n); vec_scalar_e(double * r, double * a, double * x, short n); Sets the vector r equal to the product of scalar x times vector a. All vectors are assumed to have n elements. If the pointer r equals the pointer a, the function will still work correctly. Example: double a[4], r[4]; short i; ... if ((i = vec_scalar_e(r, a, 77.2, 4) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 9 vec sub vec_sub(double * r, double * a, double * b, short n); vec_sub_e(double * r, double * a, double * b, short n); Sets the vector r equal to vector a minus vector b. All vectors are assumed to have n elements. If the pointer r equals the pointer a or b, the function will still work correctly. Example: double a[3], b[3], r[3]; short i; ... if ((i = vec_sub_e(r, a, b, 3) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 10 vec total vec_total(double * t, short n, short n_inputs, double * a, double a_factor, ...); vec_total_e(double * t, short n, short n_inputs, double * a, double a_factor, ...); Adds up a list of up to six vectors times a scalar factor, and adds it all to the vector t. Each vector has n elements. n_inputs may range from 1 to 6. t = t + a_factor*a [+b_factor*b + ... + f_factor*f] The function is limited to six input vectors because of the methods used in optimizing the assembler version of the function. If you need to add more than six vectors, use several calls to vec_total. Example: double p[3], v[3], a[3]; short i; ... if ((i = vec_total_e(p, 3, 2, v, .01, a, .00005) != 0) { /* Recover from error i here */ } /* p = p + vt + at2/2 where t = 0.01 */ Possible errors: PARAM_ERROR n_inputs was out of range Vectors - 11 vec unit vec_unit(double * r, double * a, short n); vec_unit_e(double * r, double * a, short n); Sets the vector r equal to the unit vector parallel to vector a. All vectors are assumed to have n elements. Example: double a[3], r[3]; short i; ... if ((i = vec_unit_e(r, a, 3) != 0) { /* Recover from error i here */ } Possible errors: DIV_BY_ZERO vector a was all zeroes Vectors - 12 vec zero vec_zero(double * r, short n); vec_zero_e(double * r, short n); Sets the vector r equal to zero. r is assumed to have n elements. Example: double r[10]; short i; ... if ((i = vec_zero_e(r, 10) != 0) { /* Recover from error i here */ } Possible errors: NONE Vectors - 13